ABSTRACT

Some simple derivations of two-filter-like formulae (in the smoothing problem of linear estimation) are given for general nonstationary processes. It then becomes clear how a wide sense Markovian assumption is required to give the formulae a backwards filter interpretation.
SIGNIFICANCE AND EXPLANATION


The  estimation  of  one  process  from  measurements  on  another 
related  process  is  a  problem  that  arises  in  many  areas  such  as  Time 
Series  Analysis,  Econometrics,  Communications  Engineering  and 
Control  Engineering.  Particularly  in  the  Engineering  applications 
there  is  a  great  interest  in  various  computational  forms  of  the 
algorithms  proposed  to  solve  the  above  problem.  The  aim  in  this 
article  is  to  give  simple  derivations  of  some  of  these  algorithms 
thus  revealing  how  they  apply  to  general  nonstationary  processes. 
This  facilitates  an  understanding  of  what  minimal  assumptions  are 
needed  for  their  full  utility. 


The responsibility for the wording and views expressed in this descriptive summary lies with MRC, and not with the author of this report.
SMOOTHING ESTIMATION OF STOCHASTIC PROCESSES
PART  II:  TWO  FILTER  FORMULA 


V.  Solo 

1. INTRODUCTION. Recently a number of authors have discussed various types of smoothing formulae in varying levels of generality (so far as the signal and observed process models are concerned): see Kailath and Frost [6], Ljung and Kailath [9], Lainiotis [8]. An ongoing problem has been the understanding of the two-filter formulae: Mayne [10], Fraser [3], Mehra, Badawi et al. [1]. In this article these two-filter results and some new ones are derived in a simple way in a very general setting (for arbitrary nonstationary processes). It turns out however that only if a wide-sense (i.e. second order) Markovian assumption is added can one of the filters be viewed as a backwards filter. The remainder of the paper is organized as follows. Section 2 recalls some smoothing formulae that apply to both continuous and discrete observations. Section 3 discusses two types of two-filter-like formulae for general nonstationary processes. In Section 4 one of the filters is shown to be a backwards least squares estimate provided a wide sense Markovian assumption is satisfied. Section 5 contains a derivation of some backwards filters. In Section 6 some additional two-filter-like formulae are given. The final section is a conclusion.

2. PRELIMINARIES. Consider the linear estimation of an $n_x$ dimensional process $x(t)$ from measurements on a related $n_y$ dimensional process $y(t)$. In the first instance suppose $y(t)$ is measured in discrete time at points $0 < t_1 < t_2 < \dots < t_N$ over an interval $[0,T]$; collect these observations into a vector $\mathcal{Y}_T$ and assume the covariance matrix $E(\mathcal{Y}_T\mathcal{Y}_T')$ is positive definite. Now for any $t$ in $[0,T]$, $\mathcal{Y}_T$ is comprised of two vectors $\mathcal{Y}_t$, $\mathcal{Y}_{tT}$ consisting of the data over the intervals $[0,t]$ and $(t,T]$ respectively. Let us denote the linear least squares predictor by

$$\hat{x}(t|T) \quad\text{or}\quad E(x(t)|\mathcal{Y}_T)$$

where $E(\cdot|\cdot)$ denotes wide sense conditional expectation or projection (see Parzen [12, p309]; Doob [2, p150]). Now $\hat{x}(t|T)$ is defined by the orthogonality condition

$$E\big(x(t) - \hat{x}(t|T)\big)\mathcal{Y}_T' = 0 .$$

Thus

$$\hat{x}(t|T) = E(x(t)|\mathcal{Y}_T) = E(x(t)\mathcal{Y}_T')\,[E(\mathcal{Y}_T\mathcal{Y}_T')]^{-1}\,\mathcal{Y}_T .$$

Also denote $\hat{\mathcal{Y}}_{tT} = E(\mathcal{Y}_{tT}|\mathcal{Y}_t)$ and $\tilde{\mathcal{Y}}_{tT} = \mathcal{Y}_{tT} - \hat{\mathcal{Y}}_{tT}$. Then observe

$$\begin{aligned}
\hat{x}(t|T) &= E(x(t)|\mathcal{Y}_T) \\
&= E(x(t)|\mathcal{Y}_t, \mathcal{Y}_{tT}) \\
&= E(x(t)|\mathcal{Y}_t, \tilde{\mathcal{Y}}_{tT}) \\
&= E(x(t)|\mathcal{Y}_t) + E(x(t)|\tilde{\mathcal{Y}}_{tT}) \quad\text{by orthogonality} \\
&= \hat{x}(t|t) + \gamma(t|T) \quad\text{say.}
\end{aligned}$$

Now define

$$P(t|T) = E\big(x(t) - \hat{x}(t|T)\big)\big(x(t) - \hat{x}(t|T)\big)'$$

and denote $P(t) = P(t|t)$. Then call

$$\lambda(t|T) = P^{-1}(t)\gamma(t|T) .$$

Thus we can write

$$\hat{x}(t|T) = \hat{x}(t|t) + P(t)\lambda(t|T) . \tag{1}$$

Also observe that

$$E[\hat{x}(t|t)\lambda'(t|T)] = 0 . \tag{2}$$

So we can then find

$$P(t) = E\big(x(t) - \hat{x}(t|t)\big)\big(x(t) - \hat{x}(t|t)\big)' = P(t|T) + P(t)\Theta(t|T)P(t) \tag{3}$$

or

$$P(t|T) = P(t) - P(t)\Theta(t|T)P(t) \tag{4}$$

where we have introduced

$$\Theta(t|T) = E[\lambda(t|T)\lambda'(t|T)] .$$

Equations (1)-(4) describe general continuous-discrete estimation formulae.
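As an illustrative numerical check (not part of the original derivation), relations (1)-(4) can be verified in the simplest discrete setting: a scalar $x$, one observation $y_1$ taken up to time $t$ and one observation $y_2$ taken after $t$. The covariance matrix `C`, the coefficient-vector representation of random variables and the helper names `cov` and `comb` are assumptions made for this sketch only.

```python
# Zero-mean scalars x, y1 (data up to t) and y2 (data after t) are stored as
# coefficient vectors over the basis (x, y1, y2). C is an assumed covariance.
C = [[2.0, 1.0, 0.8],
     [1.0, 1.5, 0.3],
     [0.8, 0.3, 1.2]]

def cov(u, v):
    """E(u v) for random variables given as coefficient vectors."""
    return sum(u[i] * C[i][j] * v[j] for i in range(3) for j in range(3))

def comb(a, u, b, v):
    """Return the linear combination a*u + b*v."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

x, y1, y2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

x_filt = comb(cov(x, y1) / cov(y1, y1), y1, 0.0, y1)   # filtered estimate x^(t|t)
e = comb(1.0, y2, -cov(y2, y1) / cov(y1, y1), y1)      # future data minus its projection on the past
res = comb(1.0, x, -1.0, x_filt)
P = cov(res, res)                                      # P(t)
gamma = comb(cov(x, e) / cov(e, e), e, 0.0, e)         # gamma(t|T)
lam = comb(1.0 / P, gamma, 0.0, gamma)                 # lambda(t|T)
Theta = cov(lam, lam)                                  # Theta(t|T)
x_smooth = comb(1.0, x_filt, P, lam)                   # equation (1)
err = comb(1.0, x, -1.0, x_smooth)
P_smooth = cov(err, err)                               # P(t|T)

# Each residual should vanish up to rounding.
residuals = {
    "smoother error vs y1": cov(err, y1),
    "smoother error vs y2": cov(err, y2),
    "equation (2)": cov(x_filt, lam),
    "equation (4)": P_smooth - (P - P * Theta * P),
}
for name, value in residuals.items():
    print(name, abs(value) < 1e-12)
```

The first two residuals confirm that the sum in (1) really is the projection onto all the data; the last two are (2) and (4) themselves.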

Now suppose $y(t)$ is measured continuously over the interval $[0,T]$. Assume $y(t)$ has finite variance and a positive definite covariance kernel. (Thus in a signal plus noise model we are writing $dy(t) = s(t)\,dt + dw(t)$.)

Now use the symbol $\mathcal{Y}_T$ to denote the Hilbert space spanned by $y(\sigma)$, $\sigma$ in $[0,T]$, with inner product $\langle u,v \rangle = E(uv)$ (for random variables $u,v \in \mathcal{Y}_T$). That is, $\mathcal{Y}_T$ consists of all random variables that are finite linear combinations (of the form $\sum_i a_i' y(t_i)$, $t_i$ in $[0,T]$) of $y(\sigma)$ for $\sigma$ in $[0,T]$, or limits in mean square of such linear combinations. Also call $\mathcal{Y}_t$, $\mathcal{Y}_{tT}$ the Hilbert subspaces spanned by $y(\sigma)$, $\sigma$ in $[0,t)$, $[t,T]$ respectively. The linear least squares estimate $\hat{x}(t|T)$ is the vector whose $i$th component is the unique projection $E(x_i(t)|\mathcal{Y}_T)$ that satisfies

$$E\big(x_i(t) - E(x_i(t)|\mathcal{Y}_T)\big)y'(\sigma) = 0 , \quad\text{for all } \sigma \text{ in } [0,T] .$$

With a slight abuse of notation denote $\hat{x}(t|T)$ by $E(x(t)|\mathcal{Y}_T)$. Denote by $\hat{\mathcal{Y}}_{tT}$ ($=$ "$E(\mathcal{Y}_{tT}|\mathcal{Y}_t)$") the Hilbert subspace of $\mathcal{Y}_T$ spanned by $E(y(s)|\mathcal{Y}_t)$, $s$ in $[t,T]$. Then $\tilde{\mathcal{Y}}_{tT}$ is the orthogonal complement of $\mathcal{Y}_t$ in $\mathcal{Y}_T$, namely the Hilbert subspace of $\mathcal{Y}_T$ spanned by $y(s) - E(y(s)|\mathcal{Y}_t)$, $s$ in $(t,T)$.

Then observe exactly as before

$$\begin{aligned}
\hat{x}(t|T) &= E(x(t)|\mathcal{Y}_T) \\
&= E(x(t)|\mathcal{Y}_t, \mathcal{Y}_{tT}) \\
&= E(x(t)|\mathcal{Y}_t, \tilde{\mathcal{Y}}_{tT}) \\
&= E(x(t)|\mathcal{Y}_t) + E(x(t)|\tilde{\mathcal{Y}}_{tT}) \\
&= \hat{x}(t|t) + \gamma(t|T) .
\end{aligned}$$

With the same definitions as before it easily follows that (1)-(4) hold also in continuous time. These relations have been previously given by Kailath and Frost [6] for continuous time processes possessing an innovations process. See also Kailath and Geesey [7].

Since much of the ensuing argument depends only on (1)-(4), the discrete, continuous-discrete, and continuous cases can be given a joint treatment.
3. TWO-FILTER FORMULAE. (A) The basic idea. The idea of a two-filter formula is to compute the smoothed estimate at time $t$ as a sum of two estimates; one using data in $(0,t)$, the other using data in $(t,T)$. If these estimates are least squares they ought to be orthogonal in some sense. Thus the formula could be of the form

$$\hat{x}(t|T) = P(t|T)\big(P^{-1}(t)\hat{x}(t|t) + P_B^{-1}(t|T)\hat{x}_B(t|T)\big)$$

where the subscript $B$ denotes backward and

$$P_B(t|T) = E[x(t) - \hat{x}_B(t|T)][x(t) - \hat{x}_B(t|T)]' .$$

Of course, as pointed out in (2), the two terms in the basic decomposition (1) have an orthogonality property, but just what the "backwards" orthogonality should be in general is not clear; there are indeed two possibilities

$$E\big(x(t) - \hat{x}_B(t|T)\big)x'(t) = 0$$

or

$$E\big(x(t) - \hat{x}_B(t|T)\big)\hat{x}_B'(t|T) = 0 .$$

It turns out that both these lead to satisfactory expressions. The required two-filter-like formulae will be obtained by reorganizing the basic formula (1).

(B) The first two-filter form. First apply the Matrix Inversion Lemma to (4) to see

$$P^{-1}(t|T) = P^{-1}(t) + \big(\Theta^{-1}(t|T) - P(t)\big)^{-1} . \tag{4a}$$

So define $P_\rho(t|T)$ ($\rho$ stands for reverse) by

$$\Theta^{-1}(t|T) = P_\rho(t|T) + P(t) \tag{5}$$

(notice that as $t \to T$, $\Theta(t|T) \to 0$ so $P_\rho(t|T) \to \infty$) so that (4a) is rewritten

$$P^{-1}(t|T) = P^{-1}(t) + P_\rho^{-1}(t|T) . \tag{6}$$

Now multiply this through (1) to find

$$\begin{aligned}
P^{-1}(t|T)\hat{x}(t|T) &= P^{-1}(t)\hat{x}(t|t) + P_\rho^{-1}(t|T)\big(\hat{x}(t|t) + P_\rho(t|T)(P^{-1}(t) + P_\rho^{-1}(t|T))P(t)\lambda(t|T)\big) \\
&= P^{-1}(t)\hat{x}(t|t) + P_\rho^{-1}(t|T)\big(\hat{x}(t|t) + (P_\rho(t|T) + P(t))\lambda(t|T)\big) \\
&= P^{-1}(t)\hat{x}(t|t) + P_\rho^{-1}(t|T)\big(\hat{x}(t|t) + \Theta^{-1}(t|T)\lambda(t|T)\big) \quad\text{by (5).}
\end{aligned}$$
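We are thus led to introduce

$$\hat{x}_\rho(t|T) = \hat{x}(t|t) + \Theta^{-1}(t|T)\lambda(t|T) \tag{7a}$$

so we can write

$$\hat{x}(t|T) = P(t|T)\big(P^{-1}(t)\hat{x}(t|t) + P_\rho^{-1}(t|T)\hat{x}_\rho(t|T)\big) \tag{7b}$$

and calculate

$$\begin{aligned}
E\big(x(t) - \hat{x}_\rho(t|T)\big)\big(x(t) - \hat{x}_\rho(t|T)\big)'
&= E\big(x(t) - \hat{x}(t|t) - \Theta^{-1}(t|T)\lambda(t|T)\big)\big(x(t) - \hat{x}(t|t) - \Theta^{-1}(t|T)\lambda(t|T)\big)' \\
&= P(t) - E(x(t)\lambda'(t|T))\Theta^{-1}(t|T) - \Theta^{-1}(t|T)E(\lambda(t|T)x'(t)) + \Theta^{-1}(t|T)
\end{aligned}$$

in view of (2) and the definition of $P(t)$.

Next observe

$$\begin{aligned}
E(x(t)\lambda'(t|T)) &= E[x(t)(\hat{x}(t|T) - \hat{x}(t|t))']P^{-1}(t) \\
&= -(P(t|T) - P(t))P^{-1}(t) \\
&= P(t)\Theta(t|T) \quad\text{by (3).}
\end{aligned} \tag{8}$$

Thus the above expression becomes

$$\begin{aligned}
E\big(x(t) - \hat{x}_\rho(t|T)\big)\big(x(t) - \hat{x}_\rho(t|T)\big)' &= P(t) - P(t) - P(t) + \Theta^{-1}(t|T) \\
&= P_\rho(t|T) \quad\text{by (5).}
\end{aligned} \tag{9}$$

Also observe that $\hat{x}_\rho(t|T)$ satisfies the following orthogonality properties

$$E\big(x(t) - \hat{x}_\rho(t|T)\big)x'(t) = 0 \tag{10a}$$

$$E\big(x(t) - \hat{x}_\rho(t|T)\big)\hat{x}'(t|t) = 0 . \tag{10b}$$

The first follows since

$$\begin{aligned}
E\big(x(t) - \hat{x}_\rho(t|T)\big)x'(t) &= E\big[x(t) - \hat{x}(t|t) - \Theta^{-1}(t|T)\lambda(t|T)\big]x'(t) \\
&= P(t) - \Theta^{-1}(t|T)\Theta(t|T)P(t) \quad\text{by (8)} \\
&= 0
\end{aligned}$$

while

$$\begin{aligned}
E\big(x(t) - \hat{x}_\rho(t|T)\big)\hat{x}'(t|t) &= E\big(x(t) - \hat{x}(t|t) - \Theta^{-1}(t|T)\lambda(t|T)\big)\hat{x}'(t|t) \\
&= 0 - 0 \\
&= 0 .
\end{aligned}$$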
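The first two-filter form can be checked numerically in the same spirit. The sketch below is an illustration under assumed values, not the author's computation: it uses a scalar $x$ with one past observation $y_1$ and one future observation $y_2$, an assumed covariance matrix `C`, and hypothetical helper names `cov` and `comb`. It evaluates the residuals of (6), (7b), (9), (10a) and (10b).

```python
# Assumed covariance of the zero-mean scalars (x, y1, y2); illustrative only.
C = [[2.0, 1.0, 0.8],
     [1.0, 1.5, 0.3],
     [0.8, 0.3, 1.2]]

def cov(u, v):
    """E(u v) for random variables given as coefficient vectors over (x, y1, y2)."""
    return sum(u[i] * C[i][j] * v[j] for i in range(3) for j in range(3))

def comb(a, u, b, v):
    """Return the linear combination a*u + b*v."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

x, y1, y2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

# Forward quantities of Section 2.
x_filt = comb(cov(x, y1) / cov(y1, y1), y1, 0.0, y1)   # x^(t|t)
e = comb(1.0, y2, -cov(y2, y1) / cov(y1, y1), y1)      # innovation of the future data
res = comb(1.0, x, -1.0, x_filt)
P = cov(res, res)                                      # P(t)
lam = comb(cov(x, e) / (cov(e, e) * P), e, 0.0, e)     # lambda(t|T)
Theta = cov(lam, lam)                                  # Theta(t|T)
x_smooth = comb(1.0, x_filt, P, lam)                   # equation (1)
err = comb(1.0, x, -1.0, x_smooth)
P_smooth = cov(err, err)                               # P(t|T)

# Reverse quantities of Section 3(B).
P_rho = 1.0 / Theta - P                                # equation (5)
x_rho = comb(1.0, x_filt, 1.0 / Theta, lam)            # equation (7a)
err_rho = comb(1.0, x, -1.0, x_rho)
two_filter = comb(P_smooth / P, x_filt, P_smooth / P_rho, x_rho)  # right side of (7b)

residuals = {
    "(6)": 1.0 / P_smooth - (1.0 / P + 1.0 / P_rho),
    "(7b)": max(abs(a - b) for a, b in zip(two_filter, x_smooth)),
    "(9)": cov(err_rho, err_rho) - P_rho,
    "(10a)": cov(err_rho, x),
    "(10b)": cov(err_rho, x_filt),
}
for name, value in residuals.items():
    print(name, abs(value) < 1e-9)
```

In this example $\hat{x}_\rho$ still carries a nonzero coefficient on the past observation $y_1$, which is the sense in which the formula is only a "pseudo" two-filter form.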
Equations (7a), (7b), (9), (5), (6), (10) describe pseudo two-filter formulae for the smoothing problem. (Note that these results have also been obtained recently by Badawi et al. [1] for a State Space Model by very elaborate argument.) The descriptor "pseudo" refers to the fact that it has not been demonstrated whether $\hat{x}_\rho(t|T)$ can be computed by a backwards filter. This point is discussed in Section 4 below. First however we investigate the estimate, call it $\hat{x}_\beta(t|T)$, that satisfies the other type of backwards orthogonality

$$E\big(x(t) - \hat{x}_\beta(t|T)\big)\hat{x}_\beta'(t|T) = 0 . \tag{11}$$

(C) The second two-filter form. Let us denote $\pi(t) = E(x(t)x'(t))$ and look for $\hat{x}_\beta(t|T)$ in the form

$$\hat{x}_\beta(t|T) = M(t)\hat{x}_\rho(t|T)$$

where $M(t)$ is to be chosen to ensure (11) holds. Consider then

$$\begin{aligned}
0 &= E\big(x(t) - \hat{x}_\beta(t|T)\big)\hat{x}_\beta'(t|T) \\
&= E\big(x(t) - M(t)\hat{x}_\rho(t|T)\big)\hat{x}_\rho'(t|T)M'(t)
\end{aligned}$$

implying

$$M(t) = E[x(t)\hat{x}_\rho'(t|T)]\,\big[E(\hat{x}_\rho(t|T)\hat{x}_\rho'(t|T))\big]^{-1}$$

so that we have the interesting interpretation

$$\hat{x}_\beta(t|T) = E\big(x(t)\,|\,\hat{x}_\rho(t|T)\big) .$$

To find a more informative expression for $M(t)$ continue with (10a), (10b) which imply

$$E(x(t)\hat{x}_\rho'(t|T)) = E(x(t)x'(t)) = \pi(t) \tag{12a}$$

$$-P_\rho(t|T) = E\big(x(t) - \hat{x}_\rho(t|T)\big)\hat{x}_\rho'(t|T), \quad\text{so}\quad E(\hat{x}_\rho(t|T)\hat{x}_\rho'(t|T)) = \pi(t) + P_\rho(t|T) . \tag{12b}$$

Thus

$$\hat{x}_\beta(t|T) = \pi(t)\big(\pi(t) + P_\rho(t|T)\big)^{-1}\hat{x}_\rho(t|T) . \tag{12c}$$

Now we can introduce

$$\begin{aligned}
P_\beta(t|T) &= E\big(x(t) - \hat{x}_\beta(t|T)\big)\big(x(t) - \hat{x}_\beta(t|T)\big)' \\
&= E\big(x(t) - \hat{x}_\beta(t|T)\big)x'(t) \\
&= \pi(t) - \pi(t)\big(\pi(t) + P_\rho(t|T)\big)^{-1}\pi(t) .
\end{aligned}$$
Thus invoking the Matrix Inversion Lemma

$$P_\beta^{-1}(t|T) = \pi^{-1}(t) + P_\rho^{-1}(t|T) . \tag{13}$$

Observe that, from (5), as $t \to T$, $P_\rho(t|T) \to \infty$ while $P_\beta(t|T) \to \pi(T)$. Also from (12b)

$$E(\hat{x}_\beta(t|T)\hat{x}_\beta'(t|T)) \to 0 \quad\text{as } t \to T$$

implying that the initial condition for the computation of $\hat{x}_\beta(t|T)$ is $\hat{x}_\beta(T|T) = 0$.

To calculate $\hat{x}(t|T)$ using $\hat{x}_\beta(t|T)$ return to (13) and find via (12c) that

$$\begin{aligned}
P_\rho^{-1}(t|T)\hat{x}_\rho(t|T) &= P_\rho^{-1}(t|T)\big(\pi(t) + P_\rho(t|T)\big)\pi^{-1}(t)\hat{x}_\beta(t|T) \\
&= \big(P_\rho^{-1}(t|T) + \pi^{-1}(t)\big)\hat{x}_\beta(t|T) \\
&= P_\beta^{-1}(t|T)\hat{x}_\beta(t|T) \quad\text{by (13).}
\end{aligned} \tag{14}$$

The interesting "invariance" expressed in this relation explains some of the confusion with the two-filter formulae. To summarize, we collect some of the expressions together:

$$\begin{aligned}
\hat{x}(t|T) &= \hat{x}(t|t) + P(t)\lambda(t|T) && (1) \\
&= P(t|T)\big(P^{-1}(t)\hat{x}(t|t) + P_\rho^{-1}(t|T)\hat{x}_\rho(t|T)\big) && (7\text{b}) \\
&= P(t|T)\big(P^{-1}(t)\hat{x}(t|t) + P_\beta^{-1}(t|T)\hat{x}_\beta(t|T)\big) && (15\text{a}) \\
&= P(t|T)\big(P^{-1}(t)\hat{x}(t|t) + (P_\rho^{-1}(t|T) + \pi^{-1}(t))\hat{x}_\beta(t|T)\big) \quad\text{by (13).} && (15\text{b})
\end{aligned}$$

Also

$$P^{-1}(t|T) = P^{-1}(t) + P_\rho^{-1}(t|T) . \tag{6}$$

Expression (15b) was given by Ljung and Kailath [9, p155] for a state space model. With (11) in mind we turn now to consider when $\hat{x}_\beta(t|T)$ can be computed by a backwards filter, i.e. when is $\hat{x}_\beta(t|T)$ a backwards estimate of $x(t)$ based on the data in $[t,T]$.
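The second two-filter form and the "invariance" (14) can be illustrated with the same kind of small example. This is a sketch under assumed values only: the covariance matrix `C`, the scalar setting (one past observation $y_1$, one future observation $y_2$) and the helper names are not from the source. The residuals of (11), (13), (14) and (15a) are computed directly.

```python
# Assumed covariance of the zero-mean scalars (x, y1, y2); illustrative only.
C = [[2.0, 1.0, 0.8],
     [1.0, 1.5, 0.3],
     [0.8, 0.3, 1.2]]

def cov(u, v):
    """E(u v) for random variables given as coefficient vectors over (x, y1, y2)."""
    return sum(u[i] * C[i][j] * v[j] for i in range(3) for j in range(3))

def comb(a, u, b, v):
    """Return the linear combination a*u + b*v."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

x, y1, y2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

x_filt = comb(cov(x, y1) / cov(y1, y1), y1, 0.0, y1)   # x^(t|t)
e = comb(1.0, y2, -cov(y2, y1) / cov(y1, y1), y1)      # innovation of the future data
res = comb(1.0, x, -1.0, x_filt)
P = cov(res, res)                                      # P(t)
lam = comb(cov(x, e) / (cov(e, e) * P), e, 0.0, e)     # lambda(t|T)
Theta = cov(lam, lam)
x_smooth = comb(1.0, x_filt, P, lam)                   # equation (1)
err = comb(1.0, x, -1.0, x_smooth)
P_smooth = cov(err, err)                               # P(t|T)

P_rho = 1.0 / Theta - P                                # equation (5)
x_rho = comb(1.0, x_filt, 1.0 / Theta, lam)            # equation (7a)

pi = cov(x, x)                                         # pi(t) = E x(t) x'(t)
x_beta = comb(pi / (pi + P_rho), x_rho, 0.0, x_rho)    # equation (12c)
err_beta = comb(1.0, x, -1.0, x_beta)
P_beta = cov(err_beta, err_beta)
lhs14 = comb(1.0 / P_rho, x_rho, 0.0, x_rho)
rhs14 = comb(1.0 / P_beta, x_beta, 0.0, x_beta)
form15a = comb(P_smooth / P, x_filt, P_smooth / P_beta, x_beta)

residuals = {
    "(11)": cov(err_beta, x_beta),
    "(13)": 1.0 / P_beta - (1.0 / pi + 1.0 / P_rho),
    "(14)": max(abs(a - b) for a, b in zip(lhs14, rhs14)),
    "(15a)": max(abs(a - b) for a, b in zip(form15a, x_smooth)),
}
for name, value in residuals.items():
    print(name, abs(value) < 1e-9)
```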
4. BACKWARDS FILTERS AND A MARKOVIAN ASSUMPTION. Here we investigate under what conditions $\hat{x}_\beta(t|T)$ is the linear least squares estimate, $\hat{x}_b(t|T)$, of $x(t)$ given the data $y$ in $(t,T)$. Since the least squares estimate is unique we can establish when it is also $\hat{x}_\beta(t|T)$ by ensuring

$$E\big(x(t) - \hat{x}_\beta(t|T)\big)y'(s) = 0 \quad\text{for all observation points } s \text{ in } [t,T] . \tag{16}$$

First we find a convenient expression for $\hat{x}_\beta(t|T)$. Since $\hat{x}_\beta(t|T)$ is to be a backwards estimate we can expect a backwards decomposition analogous to (1), say

$$\hat{x}(t|T) = \hat{x}_\beta(t|T) + P_\beta(t|T)\lambda_\beta(t|T) \tag{17a}$$

with $E(\hat{x}_\beta(t|T)\lambda_\beta'(t|T)) = 0$. This is indeed possible with (see Appendix A)

$$\begin{aligned}
\lambda_\beta(t|T) &= \pi^{-1}(t)\big[\hat{x}(t|t) + (P(t) - \pi(t))\lambda(t|T)\big] \\
&= \pi^{-1}(t)\big(\hat{x}(t|T) - \pi(t)\lambda(t|T)\big) .
\end{aligned} \tag{17b}$$

Observe that, via (14) and (7a),

$$\begin{aligned}
E(\hat{x}_\beta(t|T)\lambda_\beta'(t|T))
&= P_\beta(t|T)P_\rho^{-1}(t|T)\,E\big(\hat{x}(t|t) + \Theta^{-1}(t|T)\lambda(t|T)\big)\big(\hat{x}(t|t) + (P(t) - \pi(t))\lambda(t|T)\big)'\pi^{-1}(t) \\
&= P_\beta(t|T)P_\rho^{-1}(t|T)\big(\pi(t) - P(t) + \Theta^{-1}(t|T)\Theta(t|T)(P(t) - \pi(t))\big)\pi^{-1}(t) \\
&= 0
\end{aligned}$$

since $E(\hat{x}(t|t)\lambda'(t|T)) = 0$ and $E[\hat{x}(t|t)\hat{x}'(t|t)] = \pi(t) - P(t)$.

Consider then, since

$$E\big(x(t) - \hat{x}(t|T)\big)y'(s) = 0 \quad\text{for all observations } s \text{ in } [0,T] , \tag{18}$$

$$\begin{aligned}
E\big(x(t) - \hat{x}_\beta(t|T)\big)y'(s) &= P_\beta(t|T)E[\lambda_\beta(t|T)y'(s)] \\
&= P_\beta(t|T)E\big(\pi^{-1}(t)\hat{x}(t|T) - \lambda(t|T)\big)y'(s) \\
&= P_\beta(t|T)E\big[\pi^{-1}(t)x(t) - \lambda(t|T)\big]y'(s) \quad\text{by (18).}
\end{aligned}$$

Now substituting (1) in (18) gives

$$E\big[x(t) - \hat{x}(t|t) - P(t)\lambda(t|T)\big]y'(s) = 0 \quad\text{for all } s \text{ in } [0,T], \tag{19}$$
Thus (20b) holds if and only if

e(£(s)  =  0  T  _>  s  >  t 

This  clearly  holds  if  and  only  if 

E(^(s)v'  (c))  =  0  T  s  >  t  and  all  observed  points  a  in  [0 , t ) 
i.e.  if  and  only  if 

E(£(s)2/  (a))  =  E(£<s)£*  (c)  ) 
i.e.  if  and  only  if 

$$E(y(s)y'(\sigma)) = E(y(s)x'(t))\,\pi^{-1}(t)\,E(x(t)y'(\sigma))$$

for $T \ge s > t$ and all observed $\sigma$ in $(0,t)$.

Now  x^  depends  linearly  on  thus  rewriting  the  last  expression  as 

E (</  <s ) >  =  E(£tx' (t))tf1(t)E(x(t)l£’ (s)) 

we see that (20b) implies (20a). Perhaps the simplest way to visualize the joint wide sense Markovian requirement is in terms of a State Space Model for $x(t)$, $y(t)$. In the next section it is shown how, with a State Space Model, $\hat{x}_\beta(t|T)$, $\hat{x}_\rho(t|T)$ can be computed by backwards filters.
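The role of the wide sense Markovian condition can be seen in a three-variable sketch: a scalar $x$, one past observation $y_1$ and one future observation $y_2$. In this reduced setting the condition reads $E(y_2 y_1) = E(y_2 x)\pi^{-1}E(x y_1)$. The two covariance matrices below are assumed values chosen for illustration (one satisfies the condition, one does not), and `beta_and_backward` is a hypothetical helper that returns $\hat{x}_\beta$ from (12c) together with the backwards least squares estimate of $x$ given $y_2$ alone.

```python
def cov(u, v, C):
    """E(u v) for coefficient vectors over (x, y1, y2) with covariance C."""
    return sum(u[i] * C[i][j] * v[j] for i in range(3) for j in range(3))

def comb(a, u, b, v):
    """Return the linear combination a*u + b*v."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

X, Y1, Y2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

def beta_and_backward(C):
    """Return (x_beta, x_b): the estimate of (12c) and the projection of x on y2."""
    x_filt = comb(cov(X, Y1, C) / cov(Y1, Y1, C), Y1, 0.0, Y1)
    e = comb(1.0, Y2, -cov(Y2, Y1, C) / cov(Y1, Y1, C), Y1)
    res = comb(1.0, X, -1.0, x_filt)
    P = cov(res, res, C)
    lam = comb(cov(X, e, C) / (cov(e, e, C) * P), e, 0.0, e)
    Theta = cov(lam, lam, C)
    P_rho = 1.0 / Theta - P                               # equation (5)
    x_rho = comb(1.0, x_filt, 1.0 / Theta, lam)           # equation (7a)
    pi = cov(X, X, C)
    x_beta = comb(pi / (pi + P_rho), x_rho, 0.0, x_rho)   # equation (12c)
    x_b = comb(cov(X, Y2, C) / cov(Y2, Y2, C), Y2, 0.0, Y2)
    return x_beta, x_b

# E(y2 y1) = 0.4 = E(y2 x) E(x y1) / E(x x): the Markovian condition holds.
C_MARKOV = [[2.0, 1.0, 0.8],
            [1.0, 1.5, 0.4],
            [0.8, 0.4, 1.2]]
# E(y2 y1) = 0.3 breaks the condition.
C_GENERAL = [[2.0, 1.0, 0.8],
             [1.0, 1.5, 0.3],
             [0.8, 0.3, 1.2]]

for label, C in (("Markov", C_MARKOV), ("general", C_GENERAL)):
    x_beta, x_b = beta_and_backward(C)
    gap = max(abs(a - b) for a, b in zip(x_beta, x_b))
    print(label, "x_beta equals backward estimate:", gap < 1e-9)
```

In the Markovian case the coefficient of $y_1$ in $\hat{x}_\beta$ vanishes and the estimate uses future data only; in the general case it does not, so no backwards filter interpretation is available.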


5. BACKWARD FILTERS. First continuous observations are discussed. Consider the state space model

$$dy(t) = H(t)x(t)\,dt + v(t)\,dt \tag{21a}$$

$$\dot{x}(t) = F(t)x(t) + G(t)w(t) \tag{21b}$$

where $v(t)$, $w(t)$ are white noises satisfying

$$E\,v(t)v'(s) = \delta(t-s), \qquad E(w(t)w'(s)) = \delta(t-s), \qquad E\,v(t)w'(s) = 0 \quad \forall\, s,t .$$

According to the orthogonality condition the backwards filtered estimate is given by (the subscript $\beta$ is now replaced by $b$)

$$\hat{x}_b(t|T) = \int_t^T E\big(x(t)\nu_b'(s|T)\big)\nu_b(s|T)\,ds \tag{22a}$$

where $\nu_b(s|T)$ is the backwards innovations (i.e. over $[0,T]$ $\nu_b(s|T)\,ds$ is linearly invertibly equivalent to the data $dy(\sigma)$, i.e. the Hilbert space spanned by "$\nu_b(s|T)\,ds$" is the same as the one spanned by "$dy(\sigma)$").

We now find the backwards Kalman Filter. First

$$\frac{d\hat{x}_b(t|T)}{dt} = -E\big(x(t)\nu_b'(t|T)\big)\nu_b(t|T) + \int_t^T \frac{d}{dt}E\big(x(t)\nu_b'(s|T)\big)\nu_b(s|T)\,ds . \tag{22b}$$

To compute terms such as

$$\nu_b(t|T), \qquad E[x(t)\nu_b'(t|T)], \qquad \frac{d}{dt}E[x(t)\nu_b'(s|T)]$$

we need to reorganize equations (21a) and (21b) into a backwards model where the noises are orthogonal to future values of $x(t)$. Recently Verghese and Kailath [13] have shown how this can be done.

In Appendix B it is shown that the following filter results

$$d\hat{x}_b(t|T) = F_b(t)\hat{x}_b(t|T)\,dt - P_b(t|T)H'(t)\nu_b(t|T)\,dt \tag{23a}$$

$$\nu_b(t|T)\,dt = dy(t) - H(t)\hat{x}_b(t|T)\,dt \tag{23b}$$

with initial condition $\hat{x}_b(T|T) = 0$, where

$$F_b(t) = F(t) + G(t)G'(t)\pi^{-1}(t) = \big(\dot{\pi}(t) - \pi(t)F'(t)\big)\pi^{-1}(t) . \tag{23c}$$

Also

$$-\frac{dP_b(t|T)}{dt} = -P_b(t|T)F_b'(t) - F_b(t)P_b(t|T) + G(t)G'(t) - P_b(t|T)H'(t)H(t)P_b(t|T) \tag{24a}$$

with initial condition $P_b(T|T) = \pi(T)$; alternatively

$$\frac{dP_b^{-1}(t|T)}{dt} = -P_b^{-1}(t|T)F_b(t) - F_b'(t)P_b^{-1}(t|T) - H'(t)H(t) + P_b^{-1}(t|T)G(t)G'(t)P_b^{-1}(t|T) . \tag{24b}$$

Now equations (23), (24) are not the backwards filter equations that are usually given. The equations are usually given for

$$z(t|T) = P_r^{-1}(t|T)\hat{x}_r(t|T) \tag{25a}$$

or for

$$\hat{x}_r(t|T) = P_r(t|T)z(t|T) . \tag{25b}$$

In Appendix C it is shown that filters for these quantities are

$$dz(t|T) = -\big(F'(t) - P_r^{-1}(t|T)G(t)G'(t)\big)z(t|T)\,dt - H'(t)\,dy(t) \tag{26a}$$

$$d\hat{x}_r(t|T) = \big(F(t) + P_r(t|T)H'(t)H(t)\big)\hat{x}_r(t|T)\,dt - P_r(t|T)H'(t)\,dy(t) \tag{26b}$$

$$\phantom{d\hat{x}_r(t|T)} = F(t)\hat{x}_r(t|T)\,dt - P_r(t|T)H'(t)\nu_r(t|T)\,dt \tag{26c}$$

with $\nu_r(t|T)\,dt = dy(t) - H(t)\hat{x}_r(t|T)\,dt$ and initial condition arbitrary. Also

$$\frac{dP_r^{-1}(t|T)}{dt} = -P_r^{-1}(t|T)F(t) - F'(t)P_r^{-1}(t|T) - H'(t)H(t) + P_r^{-1}(t|T)G(t)G'(t)P_r^{-1}(t|T) \tag{26d}$$

with initial condition $P_r^{-1}(T|T) = 0$.

Equations (26a), (26b), (26d) appear for example in Ljung and Kailath [9, respectively equations (14), (16), (13)] ("b" in their notation is equivalent to "r" of the present notation).
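Relation (13), $P_b^{-1}(t|T) = \pi^{-1}(t) + P_r^{-1}(t|T)$, ties the two Riccati equations (24b) and (26d) together, and this can be checked by integrating both backwards from $T$. The sketch below is only an illustration: it assumes a scalar model with constant $F = -0.5$, $G = H = 1$, $T = 1$ and $\pi(0) = 2$ (so that $\pi(t) = 1 + e^{-t}$ solves $\dot{\pi} = 2F\pi + G^2$), and it uses a classical fourth-order Runge-Kutta step, which is a choice made here and not something the text prescribes.

```python
import math

# Assumed scalar, time-invariant model parameters (illustrative only).
F, G, H, T = -0.5, 1.0, 1.0, 1.0

def pi(t):
    """State variance: solves pi' = 2*F*pi + G**2 with pi(0) = 2."""
    return 1.0 + math.exp(-t)

def d_inv_Pr(t, a):
    """Right side of (26d) for a = P_r^{-1}(t|T), scalar case."""
    return -2.0 * F * a - H * H + G * G * a * a

def d_inv_Pb(t, b):
    """Right side of (24b) for b = P_b^{-1}(t|T), with F_b = F + G^2 / pi(t)."""
    Fb = F + G * G / pi(t)
    return -2.0 * Fb * b - H * H + G * G * b * b

def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step of size h (h < 0 integrates backwards)."""
    k1 = f(t, y)
    k2 = f(t + h / 2.0, y + h / 2.0 * k1)
    k3 = f(t + h / 2.0, y + h / 2.0 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate_backward(n=1000):
    """Integrate (26d) and (24b) from T down to 0; return final values and worst gap in (13)."""
    h = -T / n
    t = T
    a = 0.0             # P_r^{-1}(T|T) = 0
    b = 1.0 / pi(T)     # P_b(T|T) = pi(T)
    worst = 0.0
    for _ in range(n):
        a = rk4_step(d_inv_Pr, t, a, h)
        b = rk4_step(d_inv_Pb, t, b, h)
        t += h
        worst = max(worst, abs(b - (1.0 / pi(t) + a)))
    return a, b, worst

inv_Pr0, inv_Pb0, worst_gap = integrate_backward()
print("largest violation of (13) along the path below 1e-8:", worst_gap < 1e-8)
```

The two initial conditions differ ($0$ versus $\pi^{-1}(T)$), which is why the pair (23), (24) and the more familiar pair (26a)-(26d) look different while describing the same information.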

6. SOME ADDITIONAL TWO-FILTER-LIKE REPRESENTATIONS. In Section 4 it was pointed out how there is a pseudo-backwards expression analogous to (1), namely

$$\hat{x}(t|T) = \hat{x}_\beta(t|T) + P_\beta(t|T)\lambda_\beta(t|T) \tag{30}$$

where also

$$E(\hat{x}_\beta(t|T)\lambda_\beta'(t|T)) = 0 .$$

If we define

$$\Theta_\beta(t|T) = E(\lambda_\beta(t|T)\lambda_\beta'(t|T)) \tag{32}$$

then it follows from (30) that

$$\begin{aligned}
P_\beta(t|T) &= E\big(x(t) - \hat{x}_\beta(t|T)\big)\big(x(t) - \hat{x}_\beta(t|T)\big)' \\
&= P(t|T) + P_\beta(t|T)\Theta_\beta(t|T)P_\beta(t|T) .
\end{aligned} \tag{33}$$

Thus we have pseudo-backwards analogues of (1)-(4).

Now however we can retrace the argument of Section 3 to produce an estimate

$$\hat{x}_\varphi(t|T) = \hat{x}_\beta(t|T) + \Theta_\beta^{-1}(t|T)\lambda_\beta(t|T) \tag{34}$$

(where the subscript $\varphi$ denotes forward) satisfying

$$E\big(x(t) - \hat{x}_\varphi(t|T)\big)\big(x'(t) - \hat{x}_\beta'(t|T)\big) = 0 \tag{35a}$$

$$E\big(x(t) - \hat{x}_\varphi(t|T)\big)x'(t) = 0 \tag{35b}$$

with also

$$\Theta_\beta^{-1}(t|T) = P_\beta(t|T) + P_\varphi(t|T) \tag{36}$$

where

$$P_\varphi(t|T) = E\big(x(t) - \hat{x}_\varphi(t|T)\big)\big(x(t) - \hat{x}_\varphi(t|T)\big)' .$$

Further

$$P^{-1}(t|T) = P_\beta^{-1}(t|T) + P_\varphi^{-1}(t|T) . \tag{37a}$$

Now substituting (13) in (37a) while equating (37a) to (6) gives

$$P^{-1}(t) = \pi^{-1}(t) + P_\varphi^{-1}(t|T) . \tag{37b}$$

Then we could search for $\hat{x}_{\varphi\beta}(t|T)$ satisfying

$$E\big(x(t) - \hat{x}_{\varphi\beta}(t|T)\big)\hat{x}_{\varphi\beta}'(t|T) = 0 \tag{38}$$

and find

$$\hat{x}_{\varphi\beta}(t|T) = \pi(t)\big(\pi(t) + P_\varphi(t|T)\big)^{-1}\hat{x}_\varphi(t|T) \tag{39}$$

as well as

$$P_\varphi^{-1}(t|T)\hat{x}_\varphi(t|T) = P_{\varphi\beta}^{-1}(t|T)\hat{x}_{\varphi\beta}(t|T) \tag{40}$$

where

$$P_{\varphi\beta}(t|T) = E\big(x(t) - \hat{x}_{\varphi\beta}(t|T)\big)x'(t) .$$

However we find in Appendix D that

$$\hat{x}_{\varphi\beta}(t|T) = \hat{x}(t|t) \tag{41a}$$

so

$$P_{\varphi\beta}(t|T) = P(t) . \tag{41b}$$

If $\hat{x}_b(t|T)$ denotes the linear least squares estimate of $x(t)$ given the data in $[t,T]$ then again all the equations just derived (except (37b), (41)) hold with $\beta$ replaced by $b$ and $\varphi$ replaced by $f$ (in (37b) replace $P^{-1}(t)$ by $P_{fb}^{-1}(t|T)$). Then we can also discover the wide sense Markov condition of Section 4 by requiring that

$$\hat{x}_{fb}(t|T) = \hat{x}(t|t) .$$
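Result (41a), that retracing the construction forwards from $\hat{x}_\beta$ simply returns the filtered estimate $\hat{x}(t|t)$, can be confirmed numerically. The sketch below works under assumed values only: a scalar $x$ with one past observation $y_1$ and one future observation $y_2$, an assumed covariance matrix `C`, and hypothetical helper names. It takes $\lambda_\beta$ from (17b) and then follows (32), (34), (36) and (39).

```python
# Assumed covariance of the zero-mean scalars (x, y1, y2); illustrative only.
C = [[2.0, 1.0, 0.8],
     [1.0, 1.5, 0.3],
     [0.8, 0.3, 1.2]]

def cov(u, v):
    """E(u v) for random variables given as coefficient vectors over (x, y1, y2)."""
    return sum(u[i] * C[i][j] * v[j] for i in range(3) for j in range(3))

def comb(a, u, b, v):
    """Return the linear combination a*u + b*v."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

x, y1, y2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

# Forward quantities (Section 2) and the estimates x_rho, x_beta (Section 3).
x_filt = comb(cov(x, y1) / cov(y1, y1), y1, 0.0, y1)
e = comb(1.0, y2, -cov(y2, y1) / cov(y1, y1), y1)
res = comb(1.0, x, -1.0, x_filt)
P = cov(res, res)
lam = comb(cov(x, e) / (cov(e, e) * P), e, 0.0, e)
Theta = cov(lam, lam)
x_smooth = comb(1.0, x_filt, P, lam)
P_rho = 1.0 / Theta - P
x_rho = comb(1.0, x_filt, 1.0 / Theta, lam)
pi = cov(x, x)
x_beta = comb(pi / (pi + P_rho), x_rho, 0.0, x_rho)
err_beta = comb(1.0, x, -1.0, x_beta)
P_beta = cov(err_beta, err_beta)

# Section 6 quantities.
lam_beta = comb(1.0 / pi, x_smooth, -1.0, lam)                  # equation (17b)
Theta_beta = cov(lam_beta, lam_beta)                            # equation (32)
x_phi = comb(1.0, x_beta, 1.0 / Theta_beta, lam_beta)           # equation (34)
P_phi = 1.0 / Theta_beta - P_beta                               # equation (36)
x_phi_beta = comb(pi / (pi + P_phi), x_phi, 0.0, x_phi)         # equation (39)

residuals = {
    "(30)": max(abs(a - b) for a, b in
                zip(comb(1.0, x_beta, P_beta, lam_beta), x_smooth)),
    "(37b)": 1.0 / P - (1.0 / pi + 1.0 / P_phi),
    "(41a)": max(abs(a - b) for a, b in zip(x_phi_beta, x_filt)),
}
for name, value in residuals.items():
    print(name, abs(value) < 1e-9)
```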

7. CONCLUSION. By simple argument, a number of two-filter-like formulae that apply to fairly general nonstationary processes have been derived for the smoothing problem of linear estimation. In general the filters cannot be interpreted in a backwards sense unless the joint signal/observations model is wide sense Markovian.


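Appendix A. Derivation of (17).

From (14)

$$\begin{aligned}
\hat{x}_\beta(t|T) &= P_\beta(t|T)P_\rho^{-1}(t|T)\hat{x}_\rho(t|T) \\
&= P_\beta(t|T)P_\rho^{-1}(t|T)\big(\hat{x}(t|t) + \Theta^{-1}(t|T)\lambda(t|T)\big) \quad\text{by (7a)} \\
&= P_\beta(t|T)\big(P_\rho^{-1}(t|T) + \pi^{-1}(t)\big)\hat{x}(t|t) - P_\beta(t|T)\pi^{-1}(t)\hat{x}(t|t) + P_\beta(t|T)P_\rho^{-1}(t|T)\Theta^{-1}(t|T)\lambda(t|T) \\
&= \hat{x}(t|t) + P_\beta(t|T)\big[P_\rho^{-1}(t|T)\Theta^{-1}(t|T)\lambda(t|T) - \pi^{-1}(t)\hat{x}(t|t)\big] \quad\text{by (13)} \\
&= \hat{x}(t|T) - P_\beta(t|T)\big[\pi^{-1}(t)\hat{x}(t|t) - P_\rho^{-1}(t|T)\Theta^{-1}(t|T)\lambda(t|T) + P_\beta^{-1}(t|T)P(t)\lambda(t|T)\big] \\
&= \hat{x}(t|T) - P_\beta(t|T)\big[\pi^{-1}(t)\hat{x}(t|t) + \pi^{-1}(t)(P(t) - \pi(t))\lambda(t|T)\big] .
\end{aligned}$$

Since

$$\begin{aligned}
P_\beta^{-1}(t|T)P(t) - P_\rho^{-1}(t|T)\Theta^{-1}(t|T)
&= \big(P_\rho^{-1}(t|T) + \pi^{-1}(t)\big)P(t) - P_\rho^{-1}(t|T)\Theta^{-1}(t|T) \quad\text{by (13)} \\
&= P_\rho^{-1}(t|T)\big(P(t) - \Theta^{-1}(t|T)\big) + \pi^{-1}(t)P(t) \\
&= -I + \pi^{-1}(t)P(t) \quad\text{by (5).}
\end{aligned}$$

Thus

$$\hat{x}(t|T) = \hat{x}_\beta(t|T) + P_\beta(t|T)\pi^{-1}(t)\big[\hat{x}(t|t) + (P(t) - \pi(t))\lambda(t|T)\big] .$$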
Appendix B. Derivation of (23a), (23b), (24a), (24b).

In the present situation the backwards model is

$$dy(t) = H(t)x(t)\,dt + v(t)\,dt$$

$$\dot{x}(t) = F_b(t)x(t) + G(t)w(t)$$

where

$$E(v(t)v'(s)) = \delta(t-s) ; \qquad E(w(t)w'(s)) = \delta(t-s)$$

$$E(v(t)w'(s)) = 0 \quad \forall\, t,s$$

$$E(v(t)x'(s)) = 0 ; \qquad E(w(t)x'(s)) = 0, \quad s > t .$$

Also

$$F_b(t) = F(t) + G(t)G'(t)\pi^{-1}(t) = \big(\dot{\pi}(t) - \pi(t)F'(t)\big)\pi^{-1}(t) \tag{C1}$$

so that

$$F_b(t)\pi(t) = \dot{\pi}(t) - \pi(t)F'(t) . \tag{C2}$$

It follows that

$$\nu_b(t|T)\,dt = dy(t) - \hat{y}_b(t|T)\,dt = dy(t) - H(t)\hat{x}_b(t|T)\,dt \tag{23b}$$

$$E[x(t)\nu_b'(t|T)] = E\big\{x(t)\big[H(t)(x(t) - \hat{x}_b(t|T)) + v(t)\big]'\big\} = P_b(t|T)H'(t) \tag{C4}$$

$$\frac{d}{dt}E[x(t)\nu_b'(s|T)] = F_b(t)E[x(t)\nu_b'(s|T)] . \tag{C5}$$

Thus in (22b)

$$d\hat{x}_b(t|T) = F_b(t)\hat{x}_b(t|T)\,dt - P_b(t|T)H'(t)\big(dy(t) - H(t)\hat{x}_b(t|T)\,dt\big) \tag{C6}$$

$$\phantom{d\hat{x}_b(t|T)} = \big(F_b(t) + P_b(t|T)H'(t)H(t)\big)\hat{x}_b(t|T)\,dt - P_b(t|T)H'(t)\,dy(t) . \tag{C7}$$

Equation (23a) follows from (C6) and (23b). Next (22a) implies

$$\begin{aligned}
\pi(t) - P_b(t|T) &= E(\hat{x}_b(t|T)\hat{x}_b'(t|T)) \\
&= \int_t^T E[x(t)\nu_b'(s|T)]\,E(\nu_b(s|T)x'(t))\,ds .
\end{aligned} \tag{C8}$$

Notice that $P_b(T|T) = \pi(T)$.

Thus differentiating and using (C4) gives (using a dot for "$d/dt$")

$$\dot{\pi}(t) - \dot{P}_b(t|T) = -P_b(t|T)H'(t)H(t)P_b(t|T) + F_b(t)\big(\pi(t) - P_b(t|T)\big) + \big(\pi(t) - P_b(t|T)\big)F_b'(t) .$$

Of course

$$\dot{\pi}(t) = F(t)\pi(t) + \pi(t)F'(t) + G(t)G'(t) .$$

Thus using (C2) gives

$$-\dot{P}_b(t|T) = -P_b(t|T)H'(t)H(t)P_b(t|T) - F_b(t)P_b(t|T) - P_b(t|T)F_b'(t) + G(t)G'(t) . \tag{24a}$$

Also it follows that

$$\begin{aligned}
\frac{d}{dt}P_b^{-1}(t|T) &= -P_b^{-1}(t|T)\dot{P}_b(t|T)P_b^{-1}(t|T) \\
&= -P_b^{-1}(t|T)F_b(t) - F_b'(t)P_b^{-1}(t|T) - H'(t)H(t) + P_b^{-1}(t|T)G(t)G'(t)P_b^{-1}(t|T) .
\end{aligned} \tag{24b}$$
Appendix C. Derivation of (26a), (26b), (26d).

Differentiate $z(t|T) = P_b^{-1}(t|T)\hat{x}_b(t|T)$ to find

$$\begin{aligned}
\frac{dz(t|T)}{dt} &= P_b^{-1}(t|T)\frac{d\hat{x}_b(t|T)}{dt} + \frac{dP_b^{-1}(t|T)}{dt}\hat{x}_b(t|T) \\
&= P_b^{-1}(t|T)\big(F_b(t) + P_b(t|T)H'(t)H(t)\big)\hat{x}_b(t|T) - H'(t)\frac{dy(t)}{dt} \\
&\quad - \big(P_b^{-1}(t|T)F_b(t) + F_b'(t)P_b^{-1}(t|T) + H'(t)H(t) - P_b^{-1}(t|T)G(t)G'(t)P_b^{-1}(t|T)\big)\hat{x}_b(t|T) \quad\text{by (23a), (23b), (24b)} \\
&= -\big(F_b'(t) - P_b^{-1}(t|T)G(t)G'(t)\big)P_b^{-1}(t|T)\hat{x}_b(t|T) - H'(t)\frac{dy(t)}{dt} \\
&= -\big(F'(t) + \pi^{-1}(t)G(t)G'(t) - P_b^{-1}(t|T)G(t)G'(t)\big)z(t|T) - H'(t)\frac{dy(t)}{dt} \quad\text{by (C1)} \\
&= -\big(F'(t) - P_r^{-1}(t|T)G(t)G'(t)\big)z(t|T) - H'(t)\frac{dy(t)}{dt} \quad\text{by (13).}
\end{aligned}$$

This is (26a). For (26d) begin with (13)

$$P_r^{-1}(t|T) = P_b^{-1}(t|T) - \pi^{-1}(t) .$$

Notice that

$$\frac{d}{dt}\pi^{-1}(t) = -\big(\pi^{-1}(t)F(t) + F'(t)\pi^{-1}(t) + \pi^{-1}(t)G(t)G'(t)\pi^{-1}(t)\big),$$

so that

$$\begin{aligned}
\frac{d}{dt}P_r^{-1}(t|T) &= -P_b^{-1}(t|T)F_b(t) - F_b'(t)P_b^{-1}(t|T) - H'(t)H(t) + P_b^{-1}(t|T)G(t)G'(t)P_b^{-1}(t|T) \\
&\quad + \pi^{-1}(t)F(t) + F'(t)\pi^{-1}(t) + \pi^{-1}(t)G(t)G'(t)\pi^{-1}(t) \\
&= -P_b^{-1}(t|T)F_b(t) - F_b'(t)P_b^{-1}(t|T) - H'(t)H(t) + P_b^{-1}(t|T)G(t)G'(t)P_b^{-1}(t|T) \\
&\quad + F_b'(t)\pi^{-1}(t) + \pi^{-1}(t)F_b(t) - \pi^{-1}(t)G(t)G'(t)\pi^{-1}(t) \quad\text{by (C1).}
\end{aligned}$$

Now apply (13) to find

$$\begin{aligned}
\frac{d}{dt}P_r^{-1}(t|T) &= -P_r^{-1}(t|T)F_b(t) - F_b'(t)P_r^{-1}(t|T) - H'(t)H(t) + P_b^{-1}(t|T)G(t)G'(t)P_b^{-1}(t|T) - \pi^{-1}(t)G(t)G'(t)\pi^{-1}(t) \\
&= -P_r^{-1}(t|T)\big(F(t) + G(t)G'(t)\pi^{-1}(t)\big) - \big(F'(t) + \pi^{-1}(t)G(t)G'(t)\big)P_r^{-1}(t|T) - H'(t)H(t) \\
&\quad + P_r^{-1}(t|T)G(t)G'(t)P_r^{-1}(t|T) + \pi^{-1}(t)G(t)G'(t)P_r^{-1}(t|T) + P_r^{-1}(t|T)G(t)G'(t)\pi^{-1}(t) \\
&= -P_r^{-1}(t|T)F(t) - F'(t)P_r^{-1}(t|T) - H'(t)H(t) + P_r^{-1}(t|T)G(t)G'(t)P_r^{-1}(t|T)
\end{aligned}$$

which is (26d).

Clearly also

$$\begin{aligned}
\frac{d}{dt}P_r(t|T) &= -P_r(t|T)\frac{dP_r^{-1}(t|T)}{dt}P_r(t|T) \\
&= F(t)P_r(t|T) + P_r(t|T)F'(t) - G(t)G'(t) + P_r(t|T)H'(t)H(t)P_r(t|T) .
\end{aligned}$$

From this expression, (26a) and differentiating in (25b), (26b) is easily established.


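Appendix D. Derivation of (41a).

Consider

$$\begin{aligned}
\hat{x}_{\varphi\beta}(t|T) &= \pi(t)\big(\pi(t) + P_\varphi(t|T)\big)^{-1}\hat{x}_\varphi(t|T) \\
&= \pi(t)\big(\pi(t) + P_\varphi(t|T)\big)^{-1}\big(\hat{x}_\beta(t|T) + \Theta_\beta^{-1}(t|T)\lambda_\beta(t|T)\big) \quad\text{by (34)} \\
&= \pi(t)\big(\pi(t) + P_\varphi(t|T)\big)^{-1}\big(\hat{x}(t|T) - (P_\beta(t|T) - \Theta_\beta^{-1}(t|T))\lambda_\beta(t|T)\big) \quad\text{by (30)} \\
&= \pi(t)\big(\pi(t) + P_\varphi(t|T)\big)^{-1}\big(\hat{x}(t|T) + P_\varphi(t|T)\lambda_\beta(t|T)\big) \quad\text{by (36)} \\
&= \big(\pi^{-1}(t) + P_\varphi^{-1}(t|T)\big)^{-1}\big(P_\varphi^{-1}(t|T)\hat{x}(t|T) + \lambda_\beta(t|T)\big) \\
&= P(t)\big[(P^{-1}(t) - \pi^{-1}(t))\hat{x}(t|T) + \lambda_\beta(t|T)\big] \quad\text{by (37b) twice} \\
&= P(t)\big[(P^{-1}(t) - \pi^{-1}(t))\hat{x}(t|T) + \pi^{-1}(t)\hat{x}(t|T) - \lambda(t|T)\big] \quad\text{by (17b)} \\
&= P(t)\big[P^{-1}(t)\hat{x}(t|T) - \lambda(t|T)\big] \\
&= P(t)\big[P^{-1}(t)(\hat{x}(t|t) + P(t)\lambda(t|T)) - \lambda(t|T)\big] \quad\text{by (1)} \\
&= \hat{x}(t|t) .
\end{aligned}$$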